Abstract. We present an outline of an algorithm to generate
artificial helioseismic time-series, taking into account as much
as possible of the knowledge we have on solar oscillations. The 
hope is that it will be possible to find the causes of some of the 
systematic errors in analysis algorithms by testing them with 
such artificial time-series.

1. Introduction

As more helioseismic datasets have become available it has be- 
come apparent that at least some of the analysis algorithms 
used have systematic errors. This is perhaps most obvious 
when one compares frequency splittings from different data 
sets. These splittings show systematic differences, even when the
observations were taken at essentially the same time, hence rul- 
ing out changes in the true solar rotation rate. Even analysing 
the same dataset with different methods has yielded small but 
statistically significant differences (Bachmann et al. 1993). 

We hope that it will be possible to find the causes of at least 
some of the systematic errors by analysing artificial data sets 
for which the ’true’ mode parameters are known. To do this 
it is necessary that the artificial data closely resemble the real 
observations. The danger in checking analysis programs this 
way is obviously that if one overlooks a crucial property of the 
real data, one may be led to believe that the analysis program 
performs well on the real data when, in reality, it is flawed. On 
the other hand, it might be argued that if one cannot success- 
fully analyse artificial data, for which, in principle, one knows 
all the properties, there is not much chance that real data can 
be reduced without introducing systematic errors. 

In the following we will start with a short summary of the 
relevant properties of solar oscillations and how the oscillations 
are observed. Thereafter we will go into more detail about how 
one can construct artificial time-series given this information. 
In a separate paper we will discuss some of the results obtained 
by analysing artificial data from this program using a number 
of different methods.

2. Solar oscillations

The basic physical properties of solar oscillations and the tech- 
niques used for observing them have been described in a num- 
ber of reviews (Christensen-Dalsgaard & Berthomieu 1991 and 
Hill et al. 1991). Here only the properties important for the 
construction of artificial data will be described. 

Individual modes are generally described by their radial
order $n$, degree $l$ and azimuthal order $m$. For the purpose of
observing the modes and hence for constructing artificial data,
the radial order $n$ is only important for determining the
frequency $\nu_{nlm}$ of the mode. $l$ and $m$ on the other hand
determine the appearance of the mode on the solar surface and are
thus more important when reducing observations or generating
artificial data. The radial component of the velocity (or the
intensity) on the solar surface from a mode with a given
$(n, l, m)$ is given by

$$V_{n,l,m}(\phi, \theta, t) = \mathrm{Re}\left[ a_{nlm}(t)\, Y_l^m(\phi, \theta) \right] , \tag{1}$$

where Re[ ] denotes the real part of a complex number (Im[ ]
similarly denotes the imaginary part), and $Y_l^m$ is a spherical
harmonic given by

$$Y_l^m(\phi, \theta) = P_l^m(\cos\theta)\, e^{im\phi} . \tag{2}$$

Here $P_l^m$ is an associated Legendre function. The coordinates
$\phi$ and $\theta$ are longitude (for the purpose of
analysing solar oscillations data, the zero point of longitude is
usually placed at the Sun’s sub-Earth meridian) and colatitude
respectively, and $a(t)$ is a time-series describing the (complex)
mode amplitude. Notice that the definition of the $Y_l^m$’s is not
the most commonly used one, in that the sign is the opposite 
from the standard definition for odd negative m’s. This has 
no effect on the time-series generated, as the phases of the 
oscillations on the Sun are random. The reason for this sign 
convention is historical. 

The total surface velocity $V$ is given by a sum over all
modes of the individual mode velocities

$$V(\phi, \theta, t) = \sum_{n,l,m} V_{n,l,m}(\phi, \theta, t) . \tag{3}$$

Fig. 1. The average theoretical power spectrum of a mode with a HWHM $w = 1\,\mu$Hz with (dashed lines) and without (solid lines) time-gaps.
The left panel uses a linear scale, while the right panel uses a logarithmic scale. The dotted line shows the ungapped power scaled such
that the peak power is the same as for the gapped power. The visibility function used was from a run with the Fourier Tachometer (which
was run jointly by HAO and the National Solar Observatories) from the spring of 1989, and should thus represent real one-site data well.
The duty cycle was 34.94%.

The time-series $a(t)$ is that of a stochastically excited
damped oscillator and hence the real and imaginary parts of
the Fourier transform $\tilde{a}(\nu)$ of the time-series $a(t)$ each have
zero mean and a variance given by a Lorentzian profile in
frequency

$$\mathrm{Var}(\nu) = \frac{P/w}{1 + \left( \frac{\nu - \nu_0}{w} \right)^2} , \tag{4}$$

where $P$ is a measure of the average mode power, $w$ is the
half width at half maximum (HWHM) of the mode, and $\nu_0$ is
the mode frequency. If the mode is excited very often during
its lifetime $\tau = (2\pi w)^{-1}$, the values of the discrete Fourier
transform at different frequencies close to the central peak are
independent. If the mode is only excited infrequently, the
individual points in the Fourier transform are not independent
over a frequency range of the order $t_{\rm excitation}^{-1}$, where $t_{\rm excitation}$
is the typical time between excitations.
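
As an illustration of Eq. (4), the following minimal sketch draws one realization of the Fourier transform of an ungapped, frequently excited mode by giving the real and imaginary parts at each frequency an independent Gaussian value with the Lorentzian variance. The function names, the frequency grid and the parameter values are assumptions made for the example and are not taken from the program described here.

```python
import math
import random


def lorentzian_variance(nu, nu0, w, power):
    """Variance of the real (or imaginary) part of the transform, Eq. (4)."""
    return (power / w) / (1.0 + ((nu - nu0) / w) ** 2)


def draw_mode_spectrum(freqs, nu0, w, power, rng):
    """Independent complex amplitudes with Lorentzian variance.

    Only valid when the mode is excited often during its lifetime, so that
    neighbouring frequency points are independent.
    """
    spectrum = []
    for nu in freqs:
        sigma = math.sqrt(lorentzian_variance(nu, nu0, w, power))
        spectrum.append(complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)))
    return spectrum


rng = random.Random(1)                              # fixed seed for repeatability
freqs = [3000.0 + 0.1 * k for k in range(341)]      # microHz, hypothetical grid
spec = draw_mode_spectrum(freqs, nu0=3017.0, w=1.0, power=1.0, rng=rng)
```

At the mode frequency the variance equals $P/w$, and it has dropped to half that value one HWHM away from the peak.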

The oscillations are observed by taking images of either 
the intensity or the Doppler velocity at the solar surface (see 
Hill et al. 1991) at regular intervals of typically 1 minute. The 
images are then usually interpolated to a uniform grid in $\phi$ and
$x = \cos\theta$ and inner products with suitable masks are calculated
to isolate the different target modes

$$o_{l,m}(t) = \int_{-1}^{1} \int_{-\pi/2}^{\pi/2} V_{\rm obs}(\phi, x, t)\, M_l^m(\phi, x)\, d\phi\, dx , \tag{5}$$

where $o_{l,m}$ is the observed time-series for a target $(l, m)$, $V_{\rm obs}$
the observed surface velocity and $M_l^m$ the mask used to isolate
the $(l, m)$. The reason for the interpolation to a fixed net in
latitude and longitude is that it allows one to compensate for the
varying B-angle (which is the angle between the solar rotation
axis and the plane of the sky) and effective P-angle (which is
the angle on the images between the solar rotation axis and a
reference direction) without changing the masks as a function
of time. A uniform net in longitude also allows one to use a
Fast Fourier Transform to do the longitude integrals, substantially
decreasing the computational burden. Notice that since
modes with $+m$ and $-m$, but the same $l$, look identical at any
given time, it is not possible to separate them at this point
in the processing; thus only masks with $m \ge 0$ are used (see
later). Also notice that all the $n$’s at a given $(l, m)$ appear in
the corresponding time-series.

For reasons explained later we have chosen to use masks 
given by 

$$M_l^m(\phi, x) = \frac{1}{\pi}\, Y_l^m(\phi, x)\, \mathrm{Ap}(r) , \tag{6}$$

where Ap($r$) is an apodization chosen to reduce the contribution
from the noise close to the solar limb and
$r = (\cos^2\theta + \sin^2\phi\, \sin^2\theta)^{1/2}$ is the fractional radius of the given point on the
observed image of the Sun. The apodization function Ap is
typically 1 inside a certain radius and monotonically decreasing
outside.
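
The geometry entering the masks can be sketched as follows. The fractional radius follows the expression given above; the cosine-bell shape of the apodization and its two radii are purely hypothetical choices, used only because they are 1 inside a certain radius and decrease monotonically outside it.

```python
import math


def fractional_radius(phi, theta):
    """Fractional image radius r at longitude phi and colatitude theta."""
    return math.sqrt(math.cos(theta) ** 2
                     + (math.sin(phi) * math.sin(theta)) ** 2)


def apodization(r, r_inner=0.8, r_outer=0.95):
    """Hypothetical cosine-bell Ap(r): 1 inside r_inner, 0 beyond r_outer."""
    if r <= r_inner:
        return 1.0
    if r >= r_outer:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * (r - r_inner) / (r_outer - r_inner)))


r_example = fractional_radius(0.3, 1.1)     # an arbitrary point on the disk
ap_example = apodization(r_example)
```

With this definition of $r$, $1 - r^2 = \sin^2\theta \cos^2\phi$, so that $r = 0$ at disk centre and $r = 1$ at the limb.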

The time-series of each of the target $(l, m)$’s are then
Fourier transformed and the complex conjugate of the negative
frequency part of the spectrum is identified with $+m$, while
the positive frequency part is identified with $-m$. Due to bad
weather and other problems (such as the Sun being below the 
horizon) the Sun is normally not observed uninterrupted. It is 
thus necessary to zero-fill the time-series, which, unfortunately, 
leads to temporal sidelobes in the Fourier transforms. Recall 
that the Fourier transform of a product of two functions (the 
time-series without gaps and the gaps represented as a visibil- 
ity function with, say, 1 when there is data and 0 when there 
is not) is the convolution of the Fourier transforms of the two 
functions. Since the visibility function tends to be highly pe- 
riodic (at least at moderate latitudes) due to the presence of 
the day/night cycle, its Fourier transform (and thereby power
spectrum) contains peaks at multiples of $1\,\mathrm{day}^{-1} \approx 11.57\,\mu$Hz.
As the values of the Fourier transform of the uninterrupted
time-series at the different frequency points are independent,
the average power spectrum of the gapped time-series is the
average power spectrum of the ungapped
time-series (which is twice the variance in Eq. (4)) convolved
with the power spectrum of the window function. The effects
of the gaps are thus, as illustrated in Fig. 1, to introduce so- 
called temporal sidelobes in the power spectra and to intro- 
duce correlations among the previously independent points in 
the Fourier transform. Also the power level far from the peak 
is considerably higher relative to the peak when time-gaps are 
present. 
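
The origin of the temporal sidelobes can be illustrated with an idealized visibility function. The sketch below assumes a strictly periodic day/night window with 8 hours of data per day over 8 days at a 1 minute cadence (all hypothetical values, not the Fourier Tachometer window used for Fig. 1) and evaluates its power at a few frequencies by a direct Fourier sum.

```python
import cmath
import math

CADENCE_S = 60.0      # 1 minute sampling
DAY_S = 86400.0
N_DAYS = 8            # hypothetical run length
OBS_HOURS = 8         # hypothetical observing time per day

samples_per_day = int(DAY_S / CADENCE_S)
obs_per_day = int(OBS_HOURS * 3600 / CADENCE_S)
# Visibility function: 1 where there is data, 0 in the gaps.
window = [1.0 if (k % samples_per_day) < obs_per_day else 0.0
          for k in range(N_DAYS * samples_per_day)]


def window_power(freq_hz):
    """Power of the visibility function at one frequency."""
    total = sum(cmath.exp(-2j * math.pi * freq_hz * k * CADENCE_S)
                for k, wk in enumerate(window) if wk)
    return abs(total / len(window)) ** 2


day_freq_hz = 1.0 / DAY_S
day_freq_uhz = 1.0e6 * day_freq_hz              # about 11.57 microHz
duty_cycle = sum(window) / len(window)
p_day = window_power(day_freq_hz)               # first diurnal sidelobe
p_between = window_power(1.5 * day_freq_hz)     # halfway to the next one
```

For this idealized window the power is concentrated at multiples of 1 day$^{-1}$ and is negligible halfway between them. A real window is less regular, so its sidelobes are broader and sit on a raised background.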

Since the $Y_l^m$’s are not orthogonal on the part of the Sun we
observe, individual $(l, m)$’s are not perfectly separated by the
inner product operation described above. For Doppler velocity
observations, only the velocity in the direction of the observer
is detected and hence the effective area of the Sun observed is
even less than a hemisphere. The observed velocity signal from
a single mode is

$$\begin{aligned}
V_{n,l,m,{\rm obs}}(\phi, x, t) &= \sqrt{1 - r^2}\, V_{n,l,m}(\phi, \theta, t) \\
&= \sqrt{1 - r^2}\, \mathrm{Re}\left[ a_{nlm}(t)\, Y_l^m(\phi, \theta) \right] \\
&= \sqrt{1 - r^2}\, P_l^m(x)\, \mathrm{Re}\left[ a_{nlm}(t)\, e^{im\phi} \right] ,
\end{aligned} \tag{7}$$

where $\sqrt{1 - r^2}$ is the line-of-sight velocity projection factor.
The observed time-series for a target $(l, m)$ is now given by
a sum over all modes on the Sun weighted by coefficients
$c_{l,m,l',m'}$ and $\tilde{c}_{l,m,l',m'}$
describing the sensitivities to modes characterized by
$(l', m')$, when the target mode is characterized by $(l, m)$:


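$$\begin{aligned}
o_{l,m}(t) &= \int_{-1}^{1} \int_{-\pi/2}^{\pi/2} V_{\rm obs}(\phi, x, t)\, M_l^m(\phi, x)\, d\phi\, dx \\
&= \int_{-1}^{1} \int_{-\pi/2}^{\pi/2} \sum_{n',l',m'} V_{n',l',m',{\rm obs}}(\phi, x, t)\, M_l^m(\phi, x)\, d\phi\, dx \\
&= \sum_{n',l',m'} \int_{-1}^{1} \int_{-\pi/2}^{\pi/2} \left\{ \mathrm{Re}\left[ a_{n'l'm'}(t)\, Y_{l'}^{m'}(\phi, x) \right] \sqrt{1 - r^2}\, \mathrm{Ap}(r)\, \frac{1}{\pi}\, Y_l^m(\phi, x) \right\} d\phi\, dx \\
&= \sum_{n',l',m'} \int_{-1}^{1} \int_{-\pi/2}^{\pi/2} \Big\{ P_l^m(x) P_{l'}^{m'}(x) \left[ \mathrm{Re}(a_{n'l'm'}(t)) \cos(m'\phi) - \mathrm{Im}(a_{n'l'm'}(t)) \sin(m'\phi) \right] \\
&\qquad\qquad \left[ \cos(m\phi) + i \sin(m\phi) \right] \sqrt{1 - r^2}\, \mathrm{Ap}(r)\, \frac{1}{\pi} \Big\}\, d\phi\, dx \\
&= \sum_{n',l',m'} \int_{-1}^{1} \int_{-\pi/2}^{\pi/2} \Big\{ P_l^m(x) P_{l'}^{m'}(x)\, \mathrm{Ap}(r) \sqrt{1 - r^2} \\
&\qquad\qquad \left[ \mathrm{Re}(a_{n'l'm'}(t)) \cos(m\phi) \cos(m'\phi) - i\, \mathrm{Im}(a_{n'l'm'}(t)) \sin(m\phi) \sin(m'\phi) \right] \frac{1}{\pi} \Big\}\, d\phi\, dx \\
&= \sum_{n',l',m'} \left\{ c_{l,m,l',m'}\, \mathrm{Re}(a_{n'l'm'}(t)) - i\, \tilde{c}_{l,m,l',m'}\, \mathrm{Im}(a_{n'l'm'}(t)) \right\} .
\end{aligned} \tag{8}$$

In the last equality, the sensitivity coefficients $c_{l,m,l',m'}$ and
$\tilde{c}_{l,m,l',m'}$ are defined as

$$\begin{aligned}
c_{l,m,l',m'} &= \frac{1}{\pi} \int_{-1}^{1} \int_{-\pi/2}^{\pi/2} \left\{ P_l^m(x) P_{l'}^{m'}(x) \cos(m\phi) \cos(m'\phi)\, \mathrm{Ap}(r) \sqrt{1 - r^2} \right\} d\phi\, dx \\
&= \frac{4}{\pi}\, \delta_{l,m,l',m'} \int_{0}^{\pi/2} \int_{0}^{\pi/2} \left\{ P_l^m(x) P_{l'}^{m'}(x) \cos(m\phi) \cos(m'\phi)\, \mathrm{Ap}(r) \sin(\theta) \sqrt{1 - r^2} \right\} d\phi\, d\theta ,
\end{aligned} \tag{9}$$

and

$$\begin{aligned}
\tilde{c}_{l,m,l',m'} &= \frac{1}{\pi} \int_{-1}^{1} \int_{-\pi/2}^{\pi/2} \left\{ P_l^m(x) P_{l'}^{m'}(x) \sin(m\phi) \sin(m'\phi)\, \mathrm{Ap}(r) \sqrt{1 - r^2} \right\} d\phi\, dx \\
&= \frac{4}{\pi}\, \delta_{l,m,l',m'} \int_{0}^{\pi/2} \int_{0}^{\pi/2} \left\{ P_l^m(x) P_{l'}^{m'}(x) \sin(m\phi) \sin(m'\phi)\, \mathrm{Ap}(r) \sin(\theta) \sqrt{1 - r^2} \right\} d\phi\, d\theta ,
\end{aligned} \tag{10}$$

where

$$\delta_{l,m,l',m'} = \begin{cases} 1 & \text{if } l + m + l' + m' \text{ is even,} \\ 0 & \text{otherwise.} \end{cases} \tag{11}$$

Notice that the tangential component of the surface velocity
has been neglected in these calculations. For some modes (e.g.
g-modes) it may be necessary to include it if a high accuracy
is needed.

The c’s obviously satisfy some symmetry relations:

$$\begin{aligned}
c_{l,m,l',m'} &= c_{l',m',l,m} \\
\tilde{c}_{l,m,l',m'} &= \tilde{c}_{l',m',l,m} \\
c_{l,-m,l',m'} &= c_{l,m,l',m'} \\
\tilde{c}_{l,-m,l',m'} &= -\tilde{c}_{l,m,l',m'} .
\end{aligned} \tag{12}$$

For $m \gg 1$

$$\tilde{c}_{l,m,l',m'} \approx \mathrm{sign}(mm')\, c_{l,m,l',m'} . \tag{13}$$

Note that these relations for the c’s are only true if the
geometry of the images is correctly understood. If there are scale
errors, orientation errors, centering errors, or distortions they
may not hold. The fact that the image is sampled on a fairly
coarse grid may also introduce inaccuracies. On the other hand,
they do hold for many other types of masks, as long as these
have the same symmetric/antisymmetric properties around the
equator and the central meridian as the spherical harmonic
masks used here.

To find the corresponding crosstalks in the Fourier transforms,
suppose a mode $(l', m')$ has $a(t) = a e^{i\omega t}$ (in other words
looking at a single frequency point in the Fourier transform).
As previously mentioned, the part of the Fourier transform of
the observed time-series $o$ used for a given $m$ is the positive
frequency part for $|m|$ if $m$ is negative and the conjugate of the
negative frequency part if $m$ is non-negative. The crosstalk
to another mode $(l, m)$ can now be found by noting that the
contribution from the mode $(l', m')$ is given by

$$o_{l,m}(t) = c_{l,m,l',m'}\, a \cos(\omega t) - i\, \tilde{c}_{l,m,l',m'}\, a \sin(\omega t) \tag{14}$$

and that

$$o_{l,m}(t) = b_- e^{-i\omega t} + b_+ e^{i\omega t} , \tag{15}$$

where $b_-$ is the amplitude in the negative frequency part of
the Fourier transform and $b_+$ is the amplitude in the positive
frequency part. This gives (dropping subscripts)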

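$$b_- = \tfrac{1}{2} (c + \tilde{c})\, a \tag{16}$$

and

$$b_+ = \tfrac{1}{2} (c - \tilde{c})\, a . \tag{17}$$

Thus a mode with given $(l', m')$ will show up in the mode
$(l, m)$ with the amplitude multiplied by

$$(c_{l,m,l',m'} + \tilde{c}_{l,m,l',m'}) / 2 . \tag{18}$$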
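
The split into negative and positive frequency amplitudes can be checked numerically. The sketch below builds the contribution $o(t) = c\, a \cos(\omega t) - i\, \tilde{c}\, a \sin(\omega t)$ over one period and projects it onto $e^{-i\omega t}$ and $e^{i\omega t}$. The function name and the values of $c$, $\tilde{c}$ and $a$ are hypothetical.

```python
import cmath
import math


def sideband_amplitudes(c, c_tilde, a, n_samples=64):
    """Amplitudes (b_minus, b_plus) of exp(-iwt) and exp(+iwt) in o(t)."""
    b_minus = 0j
    b_plus = 0j
    for k in range(n_samples):
        wt = 2.0 * math.pi * k / n_samples           # one full period
        o = c * a * math.cos(wt) - 1j * c_tilde * a * math.sin(wt)
        b_minus += o * cmath.exp(1j * wt)            # isolates the exp(-iwt) term
        b_plus += o * cmath.exp(-1j * wt)            # isolates the exp(+iwt) term
    return b_minus / n_samples, b_plus / n_samples


b_minus, b_plus = sideband_amplitudes(c=0.8, c_tilde=0.6, a=2.0)
```

The projections agree with $(c + \tilde{c})a/2$ and $(c - \tilde{c})a/2$. When $\tilde{c} \approx c$ nearly all of the signal ends up in the negative frequency part, which is the part used for non-negative $m$.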

The reason for the choice of normalization of the masks is that
it is convenient that

$$(c_{l,m,l,m} + \tilde{c}_{l,m,l,m}) / 2 = 1 \tag{19}$$

for all $(l, m)$ if there is no velocity projection factor, no
apodization and the integration is done over the whole Sun.
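
A rough numerical sketch of the sensitivity coefficients is given below. It assumes that the Legendre functions are normalized so that the integral of $(P_l^m)^2$ over $x$ from $-1$ to $1$ is 1 and that $P_l^{-m} = P_l^m$, uses the $1/\pi$ prefactor of Eqs. (9) and (10), sets Ap = 1 and integrates with a simple midpoint rule. The function names and grid sizes are hypothetical, and this is not the integration scheme of the program described here.

```python
import math


def norm_legendre(l, m, x):
    """P_l^m(x), normalized so the integral of P^2 over [-1, 1] is 1 (assumed)."""
    m = abs(m)
    if l < m:
        return 0.0
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev, p = 0.0, math.sqrt(0.5)                  # P_0^0
    for k in range(1, m + 1):                        # build P_m^m
        p *= -math.sqrt((2 * k + 1) / (2 * k)) * s
    for n in range(m + 1, l + 1):                    # upward recurrence in degree
        a = math.sqrt((4 * n * n - 1) / (n * n - m * m))
        b = 0.0 if n == m + 1 else math.sqrt(
            ((n - 1) ** 2 - m * m) / (4 * (n - 1) ** 2 - 1))
        p_prev, p = p, a * (x * p - b * p_prev)
    return p


def sensitivity(l, m, lp, mp, full_sun=False, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of (c, c_tilde) with Ap = 1."""
    phi_max = math.pi if full_sun else math.pi / 2
    d_theta = math.pi / n_theta
    d_phi = 2.0 * phi_max / n_phi
    c = c_tilde = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        x = math.cos(theta)
        pp = norm_legendre(l, m, x) * norm_legendre(lp, mp, x) * math.sin(theta)
        for j in range(n_phi):
            phi = -phi_max + (j + 0.5) * d_phi
            # sqrt(1 - r^2) = sin(theta) cos(phi) on the visible hemisphere;
            # no projection factor in the whole-Sun test case.
            proj = 1.0 if full_sun else math.sin(theta) * math.cos(phi)
            c += pp * math.cos(m * phi) * math.cos(mp * phi) * proj
            c_tilde += pp * math.sin(m * phi) * math.sin(mp * phi) * proj
    scale = d_theta * d_phi / math.pi
    return c * scale, c_tilde * scale


c_full, ct_full = sensitivity(3, 2, 3, 2, full_sun=True)   # whole Sun
c_disk, ct_disk = sensitivity(3, 2, 3, 2)                  # observed hemisphere
c_odd, ct_odd = sensitivity(3, 2, 4, 2)                    # l + m + l' + m' odd
```

Under these assumptions the whole-Sun case gives $(c + \tilde{c})/2$ close to 1, the observed hemisphere gives a smaller value, and the coefficients vanish when $l + m + l' + m'$ is odd.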

Coarse sampling, seeing and any interpolations performed 
on the images tend to smear the images, causing variations in 
response that depend on $l$ and $m$. It is possible to simulate the
effect of smearing by convolving the images of the oscillations
with a point spread function (PSF), but since the smearing has
to be performed in image coordinates it is fairly costly in terms
of computing. The most important effect of smearing is to lower
the sensitivity as a function of wavenumber. A second order
effect (which is often important, as one normally attempts to
push the $l$ range) is to increase the leakages relative to the
target mode as the effective area observed decreases due to the
foreshortening close to the solar limb.

The important properties of the noise are the temporal cor- 
relation (leading to frequency dependent noise) and the corre- 
lation between the noise in different time-series/Fourier trans- 
forms. The frequency dependent noise comes from the fact that 
most noise sources produce temporally correlated noise and not 
white noise. The resulting spectrum thus depends on the noise 
generation mechanism. Important contributions to the noise 
typically originate in the Sun, in the Earth’s atmosphere, and 
in the instrument. Due to the different generation mechanisms, 
the temporal and spatial properties of these contributions generally
are different from one another. As an example, both the temporal
and the spatial characteristics of various types of solar
granulation, scintillation in the Earth’s atmosphere and ampli- 
fier noise are very dissimilar. 

The reason for the noise correlation between the different 
time-series is the same as the reason for the leakage into a 
target mode from other modes, namely that the masks used are 
not orthogonal on the observed part of the Sun. The covariance 
between the real parts of the time-series for modes
with $(l, m)$ and $(l', m')$ is given by

$$\begin{aligned}
e_{l,m,l',m'} &= \int_{-1}^{1} \int_{-\pi/2}^{\pi/2} \mathrm{Re}[M_l^m(\phi, x)]\, \mathrm{Re}[M_{l'}^{m'}(\phi, x)]\, \mathrm{Var}(\phi, x)\, d\phi\, dx \\
&= e_0\, \delta_{l,m,l',m'} \int_{0}^{\pi/2} \int_{0}^{\pi/2} \left\{ P_l^m(x) P_{l'}^{m'}(x) \cos(m\phi) \cos(m'\phi)\, \mathrm{Ap}(r)^2\, \mathrm{Var}(r) \sin(\theta) \right\} d\phi\, d\theta ,
\end{aligned} \tag{20}$$

where it has been assumed that the noise is uncorrelated
between different points on the Sun and has a variance Var($\phi$, $x$)
that is symmetric around the equator and the central meridian,
and $\delta_{l,m,l',m'}$ is the parity factor of Eq. (11).
The constant $e_0$ has absorbed factors of 2 and $\pi$ and the
effects of the discreteness of the sampling. The discreteness
of the sampling and the (lack of) independence between
different points are clearly connected. If the original pixels are
assumed independent then the integration elements in the
previous equation cannot be, since different pixels are mapped into
different sized areas in the $\phi$-$\theta$ plane. In addition to this (in a
sense trivial) problem, there are other more troublesome ones.
One is that it is difficult to estimate the covariance matrix;
another is that the computational burden goes up by several
orders of magnitude if a full covariance matrix has to be used.

For the imaginary parts it similarly follows that the
covariance of the noise time-series is given by

$$\tilde{e}_{l,m,l',m'} = e_0\, \delta_{l,m,l',m'} \int_{0}^{\pi/2} \int_{0}^{\pi/2} \left\{ P_l^m(x) P_{l'}^{m'}(x) \sin(m\phi) \sin(m'\phi)\, \mathrm{Ap}(r)^2\, \mathrm{Var}(r) \sin(\theta) \right\} d\phi\, d\theta . \tag{21}$$

It also follows that the noise in the real parts of the time-series
is uncorrelated with the noise in the imaginary parts if the
noise is symmetric around the central meridian. Again it turns
out that the covariance between the same frequency point in
two different Fourier transforms is proportional to

$$(e_{l,m,l',m'} + \tilde{e}_{l,m,l',m'}) / 2 . \tag{22}$$

Under essentially the same assumptions as those made in the 
calculation of the crosstalks, it may be shown that the same 
symmetry relations (Eqs. (12) and (13)) hold for the e’s. 


3. Generation of artificial time-series

In the previous section we discussed the basic properties of 
the time-series. In this section we present a way to generate 
time-series with prescribed properties. First we will show how 
to generate a time-series for a single mode, then how to gener- 
ate noise time-series, and finally how to combine the different 
time-series, taking into account the crosstalks and the noise 
correlations. 

As noted in the previous section a single mode is well de- 
scribed by a stochastically excited damped oscillator, hence a 
straightforward way of generating time-series is to model such 
an oscillator. It is, however, worthwhile to note that for all
relevant modes

$$\Delta t_{\rm obs} \ll \tau , \tag{23}$$

where $\Delta t_{\rm obs}$ is the time interval between samplings of the mode
and $\tau$ is the mode lifetime. As previously discussed, this means
that it is not necessary to ‘kick’ the mode each timestep, but
only at intervals much shorter than $\tau$. Since the generation of the
random numbers used to determine the kicks is fairly expensive
computationally, a considerable reduction in running time can
be achieved by applying kicks only at every $N_{\rm kick}$th observed time,
with $N_{\rm kick}$ fairly large.

Consider a mode with a frequency $\nu$, lifetime $\tau$ and an
rms velocity amplitude of $v_{\rm rms}$ ‘observed’ with a time
cadence $\Delta t_{\rm obs}$ and with kicks applied with a cadence
$t_{\rm kick} = N_{\rm kick} \Delta t_{\rm obs} \ll \tau$. In the simplest form of the algorithm we use,
the time-series of the mode is started out by setting the initial
value $a_0$ of the time-series to $v_{\rm rms}$ (rand + i * rand), where rand
is a normally distributed random number with unit variance.
The first chunk of the time-series $a_k$, $k = 0, \ldots, N_{\rm kick} - 1$ is then
set to

$$a_k = a_0 \exp(2\pi i \nu k \Delta t_{\rm obs} - k \Delta t_{\rm obs}/\tau) , \tag{24}$$

for $k = 0, \ldots, N_{\rm kick} - 1$. The initial value $a'_0$ of the next chunk
is set to

$$\begin{aligned}
a'_0 &= a_0 \exp(2\pi i \nu N_{\rm kick} \Delta t_{\rm obs} - N_{\rm kick} \Delta t_{\rm obs}/\tau) + a_{\rm kick}\, (\mathrm{rand} + i * \mathrm{rand}) \\
&= a_0 \exp(2\pi i \nu t_{\rm kick} - t_{\rm kick}/\tau) + a_{\rm kick}\, (\mathrm{rand} + i * \mathrm{rand}) ,
\end{aligned} \tag{25}$$

where $a_{\rm kick} = v_{\rm rms} \sqrt{1 - \exp(-2 t_{\rm kick}/\tau)} \approx v_{\rm rms} \sqrt{2 t_{\rm kick}/\tau}$,
such that the expectation value of the power stays constant.
Finally the individual chunks are concatenated to yield the
complete time-series. Notice that apart from $a_0$ all the chunks
are identical and that it is therefore possible to compute the
exponential once and for all. Also notice that the time-series
generated will not have exactly the specified $v_{\rm rms}$, but rather
… $- \exp(-t_{\rm kick}/\tau))$, as the decay of the mode during
each chunk has not been taken into account in the calculation
of the initial $a_0$ and $a_{\rm kick}$.
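
The simplest form of the algorithm can be sketched as follows: a decaying complex oscillation is computed once for one chunk, each chunk is that oscillation scaled by its starting value, and a random kick is added to the starting value of the next chunk. The function name, the seed and the parameter values (a 3017 $\mu$Hz mode with a HWHM of 1 $\mu$Hz, sampled every 60 s and kicked once per hour) are assumptions made for the example.

```python
import cmath
import math
import random


def mode_time_series(nu, tau, v_rms, dt_obs, n_kick, n_chunks, rng):
    """Complex amplitude of one mode, kicked every n_kick samples."""
    t_kick = n_kick * dt_obs
    rate = 2j * math.pi * nu - 1.0 / tau
    # Oscillation and decay within one chunk; identical for every chunk.
    chunk = [cmath.exp(rate * k * dt_obs) for k in range(n_kick)]
    step = cmath.exp(rate * t_kick)                  # evolution over one chunk
    a_kick = v_rms * math.sqrt(1.0 - math.exp(-2.0 * t_kick / tau))
    a0 = v_rms * complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    series = []
    for _ in range(n_chunks):
        series.extend(a0 * e for e in chunk)
        # Start of the next chunk: decayed amplitude plus a random kick.
        a0 = a0 * step + a_kick * complex(rng.gauss(0.0, 1.0),
                                          rng.gauss(0.0, 1.0))
    return series


rng = random.Random(42)
TAU = 1.0 / (2.0 * math.pi * 1.0e-6)     # lifetime in s for w = 1 microHz
series = mode_time_series(nu=3.017e-3, tau=TAU, v_rms=1.0,
                          dt_obs=60.0, n_kick=60, n_chunks=24, rng=rng)
```

Because the per-chunk exponential is computed only once, the cost per sample is a single complex multiplication, and two random numbers are needed per chunk rather than per sample.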

Kicking the mode only at selected timesteps does lead to
a slightly distorted line profile; in particular, small bumps
appear around the peak with a spacing of $t_{\rm kick}^{-1}$. These bumps
are, however, so small and so far away from the main peak
that they drown in the noise for a realistic noise level. The
infrequent driving also leads to correlations between different
points in the Fourier transforms for a given mode (again at
a distance of the order $t_{\rm kick}^{-1}$), but again these should only
affect frequencies far from the main peak. In order to reduce the
first of these problems, we have not used the method as just
described. Rather than making the final time-series by
concatenating chunks of length $t_{\rm kick}$, we make them by adding
overlapping series of length $2 t_{\rm kick}$, each multiplied by a
triangular tapering function (see Fig. 2). This means that the kicks
are applied more gradually, giving less distorted power spectra.
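
The reason triangular tapers are a natural choice can be seen from the sketch below: tapers of length $2 N_{\rm kick}$ that are offset by $N_{\rm kick}$ samples add up to exactly 1 wherever two of them overlap, so the overall amplitude is not modulated. The function names and the constant test chunks are hypothetical; how each overlapping series is generated in the actual program is not specified here.

```python
def triangular_envelope(n_kick):
    """Triangular taper of length 2*n_kick, rising from 0 to 1 and back."""
    up = [k / n_kick for k in range(n_kick)]
    down = [1.0 - k / n_kick for k in range(n_kick)]
    return up + down


def overlap_add(chunks, n_kick):
    """Add tapered chunks of length 2*n_kick, each offset by n_kick samples."""
    env = triangular_envelope(n_kick)
    out = [0.0] * (n_kick * (len(chunks) + 1))
    for j, chunk in enumerate(chunks):
        for k, value in enumerate(chunk):
            out[j * n_kick + k] += env[k] * value
    return out


n_kick = 8
# Constant chunks, so the output equals the summed taper weights.
chunks = [[1.0] * (2 * n_kick) for _ in range(5)]
total = overlap_add(chunks, n_kick)
```

Only the first and last $N_{\rm kick}$ samples, where a single taper contributes, deviate from unit weight.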

An example of mode spectra generated this way is shown 
in Fig. 3. 

Frequency dependent noise can be created either by pass- 
ing white (frequency independent) noise through a digital fil- 
ter, or by generating frequency dependent noise in the Fourier 
transform and transforming it back to the time domain. We 
have chosen the latter approach. To avoid doing very long 
Fourier transforms and to avoid storing long noise time-series, 
we create small chunks of noise time-series and concatenate 
them. Since the noise in different chunks is not correlated, the 
noise spectrum will not be correct below frequencies of
approximately $t_{\rm noise}^{-1}$, where $t_{\rm noise}$ is the length of the chunks. Since this
is far below the frequency of the modes for even moderately long
series, this should not lead to any major problems, however.

The correlation of the noise between different time-series/
Fourier transforms is slightly more difficult to handle. To
generate time-series $y_i$ with a prescribed covariance matrix $E$ one
starts with uncorrelated time-series $x_i$ with unit variance and
sets $y = Gx$ at each time step, where $E = GG^T$. One way to
find $G$ is to perform a Cholesky decomposition of $E$ (see e.g.
Golub & Van Loan 1989), where $G$ is a lower triangular matrix.
That this produces the desired covariance follows from

$$\begin{aligned}
\mathrm{Cov}(y_i, y_j) &= \sum_{k,k'} G_{ik} G_{jk'}\, \mathrm{Cov}(x_k, x_{k'}) = \sum_k G_{ik} G_{jk} \\
&= \sum_k G_{ik} (G^T)_{kj} = (GG^T)_{ij} = E_{ij} ,
\end{aligned} \tag{26}$$

and is the traditional way of generating vectors with a
prescribed covariance matrix.
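
A minimal sketch of this procedure for a small, hypothetical $3 \times 3$ covariance matrix is given below. The Cholesky factor is computed with the textbook algorithm, and correlated samples are then obtained as $y = Gx$. The matrix entries, the seed and the number of samples are chosen only for illustration.

```python
import math
import random


def cholesky(e):
    """Lower triangular G with G G^T = e, for symmetric positive definite e."""
    n = len(e)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(g[i][k] * g[j][k] for k in range(j))
            if i == j:
                g[i][j] = math.sqrt(e[i][i] - s)
            else:
                g[i][j] = (e[i][j] - s) / g[j][j]
    return g


def correlated_sample(g, rng):
    """One time step: y = G x, with x uncorrelated and of unit variance."""
    x = [rng.gauss(0.0, 1.0) for _ in g]
    return [sum(g[i][k] * x[k] for k in range(i + 1)) for i in range(len(g))]


E = [[1.0, -0.4, 0.1],
     [-0.4, 1.0, -0.4],
     [0.1, -0.4, 1.0]]                    # hypothetical covariance matrix
G = cholesky(E)
rng = random.Random(0)
samples = [correlated_sample(G, rng) for _ in range(20000)]
cov01 = sum(y[0] * y[1] for y in samples) / len(samples)
```

The sample covariance `cov01` approaches the prescribed value of $-0.4$ as the number of time steps grows.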

A problem with generating the noise time-series this way 
is that, in order to get the noise correlations between distantly 
spaced $l$’s correct, one should generate all noise time-series for
all $l$’s and $m$’s and use the covariance matrix for all $l$’s and $m$’s.
This is clearly not realistic; the covariance matrix is far too big.
Even for one $l$ it is problematic to use the matrix for all $m$’s
when $l$ is large, as the condition number for $E$ becomes very
large (meaning that the covariance matrix is close to singular).

A potential solution to this problem is to note that when 
the integrals in Eqs. (20) and (21) are discretized one obtains 
$E = AA^T$ for some matrix $A$ (which generally has a very
large number of columns). If the Singular Value Decomposition
(SVD) of this matrix is $A = U \Sigma V^T$ then

$$E = AA^T = U \Sigma V^T V \Sigma U^T = U \Sigma \Sigma U^T = U' U'^T , \tag{27}$$

where $U' = U\Sigma$ can be used the same way that $G$ from the
Cholesky decomposition was used before. This $U'$ can be
truncated by only including the highest singular values. It can be
shown that the error in $E$ introduced by truncating $U'$ goes
like the sum of the squares of the neglected singular values,
and is thus presumably insignificant. In any case the errors
introduced this way are probably of the same magnitude as
those in the Cholesky decomposition, if not smaller, if enough
singular values are retained. In addition the truncated $U'$
matrix has fewer columns than $G$ and thus a smaller $x$ vector
can be used. If this method is used it should be possible to
cover a substantial $l$-range consistently without running into
numerical problems, but this method has not been tested.
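
The size of the truncation error can be made explicit. Writing $u_k$ for the columns of $U$ and $\sigma_k$ for the singular values in decreasing order, and letting $U'_K$ denote $U'$ restricted to its first $K$ columns (notation introduced here for illustration), one finds:

```latex
E - U'_K U_K'^{T} = \sum_{k > K} \sigma_k^2 \, u_k u_k^{T} ,
\qquad
\mathrm{tr}\left( E - U'_K U_K'^{T} \right) = \sum_{k > K} \sigma_k^2 ,
\qquad
\left\| E - U'_K U_K'^{T} \right\|_2 = \sigma_{K+1}^2 .
```

The total noise variance that is lost is thus the sum of the squares of the neglected singular values, and no single linear combination of time-series loses more variance than $\sigma_{K+1}^2$.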

Finally the finished time-series are produced by adding up 
the modes according to Eq. (8) and adding the noise as just 
described. 

4. Discussion 

It would clearly be useful to be able to compare the artificial 
data to real data to see if we have indeed been able to reproduce 
the essential properties, such as crosstalks, mode frequencies, 
amplitudes and linewidths. Apart from checking such obvious 
things as that the overall levels of modes and noise are correct 
and looking at the linewidths of the produced modes, it is, 
however, very difficult to check whether the properties of the 
artificial data are identical to those of the real data. 

Unfortunately many of the possible comparisons depend 
critically on knowing parameters for the real data that are dif- 
ficult to determine and are largely irrelevant for the purpose 
of testing analysis programs. Examples of such parameters are 
exact mode frequencies, linewidths, and splittings, which, al- 
though they may be the final goal of the analysis, are probably 
not critical to model exactly as long as they are approximately
correct. For instance, if all frequencies are in error by say $1\,\mu$Hz,
this is unlikely to make any difference for the purpose of testing
analysis programs, as long as the frequency spacings are close
to their observed values.

Fig. 2. The principle behind the triangular functions. Three consecutive chunks and their corresponding envelopes have been shown together
with their sum. A mode with a frequency of $3000\,\mu$Hz and a chunk length of 2 minutes has been used. For clarity no decays or excitations
were used. Also the coarse sampling of the modes generally used was ignored.

Fig. 3. The average of 1000 power spectra for a mode with a frequency $\nu = 3017\,\mu$Hz and a HWHM $w = 1\,\mu$Hz. The theoretical limit (scaled
to match the power level) has been indicated by a dashed line (which is not visible).

A section of the p-mode spectrum for artificial data pro- 
duced using this code and the corresponding spectrum from 
a run with the Fourier Tachometer are shown in Fig. 4. For 
illustrative purposes, the power spectra were collapsed in m, 
shifting the spectra for the individual $m$’s to remove an
approximation to the frequency shifts caused by the solar rotation.
The different plots of artificial data in Fig. 4 show where each of 
the peaks in the spectrum of the real data comes from, and il- 
lustrate the complexity introduced by crosstalks and time gaps. 
The increased complexity introduced by time-gaps is, indeed, 
the strongest argument for projects such as GONG and SOHO, 
which aim to reduce the time-gaps to negligible proportions. 

As can be seen from the third and fourth plots in the right 
hand column of Fig. 4, the real and the artificial data look very 
much alike. Note that it was not attempted to match the noise 
level perfectly between the real and artificial data. Although 
the general appearance of the real and artificial data is very
similar, there are minor differences, such as different ratios of
the power between peaks. Among the reasons for these differences
are the different signal and noise levels, slightly different
dependencies of the mode amplitudes and linewidths on frequency,
slightly different crosstalks due to (for instance) problems
in estimating the PSF, and slightly different a-coefficients.
Also the fact that modes with $\Delta l > 4$ were neglected in the
artificial data may be the cause of some of the differences.

Among the more subtle things that can be checked is
that the correlations in both the time-series and the Fourier
transforms are as in the real data. A scatter plot of the values
in a real and artificial time-series for $(l, m) = (30, 0)$ and
$(l, m) = (30, 2)$ is shown in Fig. 5. Notice that the time-series of
modes with $\Delta l = 0$ and $\Delta m = 2$ are generally anti-correlated,
as the corresponding $P_l^m$’s have opposite sign around the equator
($x = 0$), where the weighting by the masks is the highest.
In order not to have the plot dominated by the low frequency 
noise (which is different), the time-series were high pass fil- 
tered. Again it can be seen that the real and artificial data 
behave similarly. However, the match is not perfect; in particular
it appears that the overall power level is somewhat higher
for the real data.

From looking at observations, it turns out that the distri- 
bution of noise on the Sun is not a smooth function of radius 
only, but often has additional contributions around active re- 
gions, in addition to various bizarre instrumental effects. This 
makes it very difficult to reproduce the correlations in the ob- 
served noise. It is of course possible to use the observed cor- 
relations from real data to generate the artificial noise, but 
this approach has other problems. Also, it is far from obvious 
that the noise properties are the same at the frequencies where 
they are easy to measure (typically low frequencies) as they 
are where one cares about them (in the p-mode band). Differ- 
ent noise sources are likely to have very different spatial and 
temporal characteristics, leading to variations in the correla- 
tions with frequency. Also, a large contribution to the noise in 
the p-mode band seems to be unresolved modes and temporal 
sidelobes of those. As can be seen from Fig. 4, the addition 
of a realistic noise level makes a very small difference to the 
apparent noise level at the peak of the p-mode power distribu- 
tion, especially if time gaps are present. When looking at the 
data without time gaps (which would presumably be similar to 
that obtainable with the GONG network), it is important to
note that the contributions from modes with $\Delta l > 4$ have been
neglected, and that, given the level of the modes with $\Delta l = 3$,
they are likely to contribute significantly to the apparent
background noise level.

In a sense one of the best checks of whether the essential 
properties have been modelled properly is to check that various 
analysis programs give mode parameters close to the input 
values. Unfortunately this does not prove (or disprove) that 
the underlying assumptions are correct, only that they have 
been consistently implemented between this program and the 
analysis programs. This is of course not totally useless, but it 
is not entirely satisfactory either. 

All in all it thus appears that the best one can do is use 
the program assuming that things have been properly imple- 
mented. If it later turns out that the analysis program behaves 
differently on real and artificial data by, for instance, indicating 
that the statistical properties are different, it will be necessary 
to find out which property of the real data was neglected or 
incorrectly implemented. 

Despite these problems with verifying the program, it has 
been an extremely useful tool for testing our different analysis 
procedures. One of the very useful features is the ability to turn 
various properties on and off. In particular it is useful to be able
to turn off the noise, the time gaps and/or various parts of the 
crosstalks, as those can easily lead to problems if not properly 
taken care of. Some of the effects of turning various properties 
of the time series on and off have been shown in Fig. 4. As can 
be seen the spectrum is considerably simpler when crosstalks 
and/or time gaps are neglected. 

Using this program we have been able to identify problems 
in some of the earlier versions of our analysis codes in the deter- 
mination of the mode linewidths and the a-coefficients. These 
parameters are very sensitive to certain errors in modelling the 
crosstalks, as the individual m’s are generally not resolved in 
the power spectra. 

Also, when we have been concerned that some particular 
neglected effect has been causing problems, or when modifica- 
tions to the programs have been made, it has been extremely 
useful to know what the correct results were. If one had had 
to rely on real data for these tests, it would only have been 
possible to see that some parameter changed, and not whether 
the change improved the results or made them worse. 

A description of one analysis method and some of the re- 
sults obtained by analysing artificial data generated by the 
program described here can be found in Schou (1993). A more 
systematic comparison of results from analysing the output of 
this program using a number of different analysis methods is 
in preparation. 

Using this program it is possible to address some problems 
that cannot be treated with programs taking fewer of 
the physical properties of the modes into account (e.g. Anderson 
et al. 1991). These include the effects of crosstalks, both when 
it comes to introducing interfering peaks from neighbouring 
modes and correlations among the time-series, and the effects 
of correlated noise. These are problems that are likely to affect 
the mode linewidths and the so-called a-coefficients (describing 
the effects of asphericities on the mode frequencies). 

Fig. 4. Examples of power spectra for l = 30. The power spectra for the individual m’s have been shifted according to the a-coefficients 
used to generate the artificial data and summed over m. Only a small fraction of the spectrum around n = 12, which is close to the peak of 
the power in the p-mode band, has been shown. The left column of plots shows data without time-gaps, the right hand column shows plots 
with the time-gaps used for Fig. 1. The top row shows the spectrum with no crosstalk from neighbouring l’s and no noise. The second row 
shows a spectrum in which crosstalk out to Δl of 3 has been included, but still without noise. The third row is similar to the second, except 
that a noise level similar to that from observations with the Fourier Tachometer has been added. The bottom row shows a spectrum using 
real data from the run with the Fourier Tachometer used for Fig. 1. Unfortunately, it was not possible to make the lower left hand plot 

Fig. 6. Scatter plots showing the values of points in a time-series for (l,m) = (30,2) versus those for a time-series with (l,m) = (M.0). 
The left hand plot shows results from real data, while the right hand plot shows artificial data. In order to reduce the effect of the low 
frequency noise, the time-series were high pass filtered. Notice that the overall power level is not matched perfectly. 

Also note that the dependence between the points in the Fourier 
transform caused by the gaps in the time-series is treated 
properly here, which is not the case if the Fourier transforms 
are generated by multiplying white noise by the limit spectrum 
(from Eq. (4), convolved with the sidelobe spectrum of the 
time gaps). These correlations are important, for example, 
when estimating the errors on the fitted parameters. If the 
correlations are not taken into account, the values of the Fourier 
transform in the side lobes and the main lobe are treated as 
independent, which gives lower standard errors on the fitted 
parameters than if the correlations are properly modelled. 
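
The correlation between Fourier points introduced by gaps can be illustrated with a small sketch. For white noise observed through a window w(t), the covariance between the Fourier transform at two frequency bins separated by dk is proportional to the transform of w(t)^2 at dk, which vanishes for an uninterrupted series but not for a gapped one. The code below evaluates this analytically and checks it with a Monte Carlo estimate. The function names, the series length of 240 samples and the day/night window observing 8 out of every 24 samples are assumptions made for illustration only, and white noise is used in place of the limit spectrum for simplicity.

```python
import cmath
import math
import random


def day_night_window(n, period, observed):
    """0/1 window: `observed` good samples at the start of each `period`."""
    return [1.0 if t % period < observed else 0.0 for t in range(n)]


def window_coherence(window, dk):
    """|E[F(k) F*(k+dk)]| / E[|F(k)|^2] for white noise seen through window."""
    n = len(window)
    num = sum(w * w * cmath.exp(-2j * math.pi * dk * t / n)
              for t, w in enumerate(window))
    return abs(num) / sum(w * w for w in window)


def dft_bin(series, k):
    """Discrete Fourier transform of `series` at the single bin k."""
    n = len(series)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n)
               for t, x in enumerate(series))


def sample_coherence(window, k, dk, n_real, rng):
    """Monte Carlo estimate of the coherence between bins k and k + dk."""
    cross, p1, p2 = 0j, 0.0, 0.0
    for _ in range(n_real):
        series = [w * rng.gauss(0.0, 1.0) for w in window]
        f1, f2 = dft_bin(series, k), dft_bin(series, k + dk)
        cross += f1 * f2.conjugate()
        p1 += abs(f1) ** 2
        p2 += abs(f2) ** 2
    return abs(cross) / math.sqrt(p1 * p2)


n = 240
full = [1.0] * n                    # uninterrupted observations
gapped = day_night_window(n, 24, 8)  # hypothetical 1/3 duty cycle
dk = n // 24                         # bin offset of the first daily sidelobe

print("no gaps  :", window_coherence(full, dk))
print("with gaps:", window_coherence(gapped, dk))
```

With these assumed parameters the main lobe and the first daily sidelobe are strongly correlated in the gapped case (a coherence of roughly 0.83) and uncorrelated without gaps. Treating such points as independent in a fit counts the same information more than once, which is why the resulting standard errors come out too small.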

Unfortunately, we have not been able to eliminate all prob- 
lems from our analysis procedures by using artificial data. In 
particular it appears that we have a systematic problem with 
our determination of the a-coefficients when analysing observations 
taken with the Fourier Tachometer (see Bachmann et al. 
1993). Despite extensive tests using artificial data as described 
here, we have not been able to find a problem in our analysis 
programs. On the other hand, if it had not been for these tests, 
we would probably not have been able to convince ourselves 
that the problem is not in the time-series analysis, given that 
this is by far the most complicated part of the analysis. We are 
therefore inclined to believe that the problem is in our under- 
standing of either the physics or the instrument, rather than 
in the time-series analysis programs as such. 